48 research outputs found

    MOWER: A NEW DESIGN FOR NON-BLOCKING MISPREDICTION RECOVERY

    Mower is a micro-architecture technique that targets branch misprediction penalties in superscalar processors. It speeds up misprediction recovery by dynamically evicting stale instructions and fixing the RAT (Register Alias Table) using explicit branch dependency tracking, which is implemented with simple bit matrices. This low-overhead technique allows the recovery process to overlap with instruction fetching, renaming and scheduling from the correct path. Our evaluation indicates that the mechanism yields performance very close to ideal recovery and provides up to 5% speed-up and a 2% reduction in power consumption compared to a traditional recovery mechanism using a reorder buffer and a walker. The simplicity of the mechanism should permit easy implementation of Mower in an actual processor.
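
The bit-matrix idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the class name `BranchDependencyMatrix`, its methods, and the slot-allocation policy are hypothetical, and RAT repair is not modeled. Each in-flight instruction keeps one row of bits, and bit `b` is set when the instruction was fetched after the still-unresolved branch in slot `b`.

```python
class BranchDependencyMatrix:
    """Toy model (hypothetical names): one bit row per in-flight instruction."""

    def __init__(self, num_branch_slots):
        self.num_branch_slots = num_branch_slots
        self.unresolved = 0      # bitmask of branch slots awaiting resolution
        self.rows = {}           # instruction id -> dependency bitmask
        self.branch_owner = {}   # branch slot -> id of the branch instruction

    def _free_slot(self):
        for slot in range(self.num_branch_slots):
            if not (self.unresolved >> slot) & 1:
                return slot
        raise RuntimeError("no free branch slot")

    def insert(self, instr_id, is_branch=False):
        """Record a newly fetched instruction; return its slot if it is a branch."""
        slot = self._free_slot() if is_branch else None
        # Snapshot the older unresolved branches before setting the branch's own bit.
        self.rows[instr_id] = self.unresolved
        if slot is not None:
            self.unresolved |= 1 << slot
            self.branch_owner[slot] = instr_id
        return slot

    def _release(self, slot):
        """Clear a branch column once the branch is resolved or squashed."""
        mask = ~(1 << slot)
        self.unresolved &= mask
        del self.branch_owner[slot]
        for instr_id in self.rows:
            self.rows[instr_id] &= mask

    def resolve(self, slot, mispredicted):
        """Resolve a branch; on a misprediction, evict and return the stale instructions."""
        stale = []
        if mispredicted:
            bit = 1 << slot
            stale = [i for i, row in self.rows.items() if row & bit]
            stale_set = set(stale)
            for i in stale:
                del self.rows[i]
            # Younger branches on the wrong path are stale too, so free their slots.
            for s, owner in list(self.branch_owner.items()):
                if owner in stale_set:
                    self._release(s)
        self._release(slot)
        return stale


m = BranchDependencyMatrix(num_branch_slots=4)
m.insert("i0")
b0 = m.insert("br0", is_branch=True)
m.insert("i1")
print(m.resolve(b0, mispredicted=True))
```

In this model a misprediction is handled by a single column lookup rather than a sequential walk, which is one way to read the abstract's claim that recovery can overlap with fetching from the correct path.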

    Mitigating the Effect of Misspeculations in Superscalar Processors

    Modern superscalar processors rely heavily on speculative execution, in which instructions are executed speculatively and then verified. If the prediction differs from the execution result, a misspeculation recovery is performed. Misspeculation recovery penalties still account for a substantial performance loss. This work focuses on techniques to mitigate recovery penalties and proposes practical mechanisms that are thoroughly implemented and analyzed. In general, the misspeculation penalty can be divided into four parts: misspeculation detection delay, stale instruction elimination delay, state restoration delay and pipeline fill delay. This dissertation does not consider the detection delay; instead, we design four innovative mechanisms. Some of these mechanisms target a specific recovery delay, whereas others target multiple types of delay in a unified algorithm. Mower was designed to address the stale instruction elimination delay and the state restoration delay by using a special walker. When a misprediction is detected, the walker scans and repairs the instructions that are younger than the mispredicted instruction. During the walk, the correct state is restored and the stale instructions are eliminated. Based on Mower, we further simplify the design and develop a Two-Phase recovery mechanism. This mechanism uses only a basic recovery mechanism except when the retire stage is stalled by a long-latency instruction. When the retire stage is stalled, the second phase is launched and the instructions in the pipeline are re-fetched. The Two-Phase mechanism recovers from an earlier point in the program and overlaps the recovery penalty with the long-latency penalty. In reality, some of the instructions on the wrong path can be reused during recovery. However, such reuse of misprediction results is not easy and usually involves significant complexity. We design Passing Loop to reduce the pipeline fill delay. We apply our mechanism only to short forward branches, which eliminates a substantial amount of complexity. For memory dependence speculation and the delays caused by memory ordering violations, we develop a mechanism that optimizes store-queue-free architectures. A store-queue-free architecture experiences more memory dependence mispredictions because of its aggressive approach to speculation. A common solution is to delay the execution of an instruction that is likely to be mispredicted. We propose a mechanism, called “Dynamic Memory Dependence Predication” (DMDP), that dynamically inserts predicates comparing the addresses of memory instructions. This mechanism advances instruction execution to its earliest possible point and reduces the number of mispredictions.

    Decreased programmed cell death ligand 2-positive monocytic myeloid-derived suppressor cells and programmed cell death protein 1-positive T-regulatory cells in patients with type 2 diabetes: implications for immunopathogenesis

    Objectives: The activation of immune cells plays a significant role in the progression of type 2 diabetes. This study aimed to investigate the potential role of myeloid-derived suppressor cells (MDSCs) and T-regulatory cells (Tregs) in type 2 diabetes. Methods: A total of 61 patients diagnosed with type 2 diabetes were recruited. Clinical characteristics were reviewed and peripheral blood samples were collected. We calculated the percentage of different cells. Frequencies of MDSC subsets referred to the percentage of G-MDSCs (CD15+CD33+CD11b+CD14-HLA-DR-/low) in CD45-positive cells and the percentage of M-MDSCs (CD14+CD15-CD11b+CD33+HLA-DR-/low) in lymphocytes plus monocytes. Results: Frequencies of programmed cell death ligand 1-positive granulocytic MDSCs (PD-L1+ G-MDSCs), programmed cell death ligand 2-positive monocytic MDSCs (PD-L2+ M-MDSCs), PD-L2+ G-MDSCs, and programmed cell death protein 1-positive Tregs (PD-1+ Tregs) were decreased in patients with type 2 diabetes. The frequency of PD-1+ Tregs was positively correlated with PD-L2+ M-MDSCs (r = 0.357, P = 0.009) and negatively correlated with HbA1c (r = −0.265, P = 0.042), fasting insulin level (r = −0.260, P = 0.047), and waist circumference (r = −0.373, P = 0.005). Conclusions: Decreased PD-L2+ M-MDSCs and PD-1+ Tregs may promote effector T cell activation, leading to chronic low-grade inflammation in type 2 diabetes. These findings highlight the contribution of MDSCs and Tregs to the immunopathogenesis of type 2 diabetes and suggest their potential as targets for new therapeutic approaches.

    Exploration of the hypoglycemic mechanism of Fuzhuan brick tea based on integrating global metabolomics and network pharmacology analysis

    Introduction: Fuzhuan brick tea (FBT) is a beverage popular worldwide that has appreciable potential for regulating glycometabolism. However, reports on the hypoglycemic mechanism of FBT remain limited. Methods: In this study, the hypoglycemic effect of FBT was evaluated in a pharmacological experiment on Kunming mice. Global metabolomics and network pharmacology were combined to discover the potential target metabolites and genes. In addition, real-time quantitative polymerase chain reaction (RT-qPCR) analysis was performed for verification. Results: Seven potential target metabolites and six potential target genes were screened using the integrated approach. RT-qPCR analysis showed that the mRNA expression of VEGFA, KDR, MAPK14, and PPARA differed significantly between normal and diabetes mellitus mice, with a reversal after FBT treatment. Conclusion: These results indicated that the hypoglycemic effect of FBT was associated with its anti-inflammatory activities and regulation of lipid metabolism disorders. The exploration of the hypoglycemic mechanism of FBT would be meaningful for its further application and development.

    CHES: a space-borne astrometric mission for the detection of habitable planets of the nearby solar-type stars

    The Closeby Habitable Exoplanet Survey (CHES) mission is proposed to discover habitable-zone Earth-like planets of the nearby solar-type stars (~10 pc away from our solar system) via micro-arcsecond relative astrometry. The major scientific objectives of CHES are to search for Earth Twins or terrestrial planets in habitable zones orbiting 100 nearby FGK stars, and further to conduct a comprehensive survey and extensively characterize the nearby planetary systems. The primary payload is a high-quality, low-distortion, high-stability telescope. The optical subsystem is a coaxial three-mirror anastigmat (TMA) with a 1.2 m aperture, a 0.44° × 0.44° field of view and a 500 nm to 900 nm working waveband. The camera focal plane is composed of 81 MOSAIC scientific CMOS detectors, each with 4K × 4K pixels. Heterodyne laser interferometric calibration technology is employed to ensure micro-arcsecond level (1 μas) relative astrometry precision, meeting the requirements for detection of Earth-like planets. The CHES satellite operates at the Sun-Earth L2 point and observes all target stars for 5 years. CHES will offer the first direct measurements of true masses and inclinations of Earth Twins and super-Earths orbiting our neighbor stars based on micro-arcsecond astrometry from space. This will enhance our understanding of the formation of diverse nearby planetary systems and the emergence of other worlds for solar-type stars, and finally reflect the evolution of our own solar system. Comment: 39 pages, 37 figures, Invited Review, accepted to Research in Astronomy and Astrophysics

    Dynamic memory dependence predication

    Store-queue-free architectures remove the store queue and use memory cloaking to communicate in-flight stores instead. In these architectures, frequent mispredictions may occur when store-to-load dependencies are inconsistent. We present DMDP (Dynamic Memory Dependence Predication), which modifies the microarchitecture behavior for such loads to mitigate memory dependence mispredictions. When a given dependence is hard to predict, i.e., a given load occasionally depends on a particular store but is independent at other times, DMDP predicates the load so that the address of the load is compared with the address of the predicted store to compute a predicate. This predicate guides the load to obtain the value from either the cache or the colliding store. The predication provided by DMDP i) enables the loads and their dependent instructions to execute much earlier, ii) reduces the hardware complexity of store-queue-free mechanisms, and iii) reduces the number of mispredictions. DMDP outperforms a state-of-the-art store-queue-free architecture by 7.17% on Integer benchmarks and 4.48% on Float benchmarks in our SPEC 2006 evaluation. We further show that despite executing extra predication instructions, DMDP is power efficient as it saves about 6.7% on EDP.
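
The predicate described above amounts to an address comparison that selects between two value sources. The following is a minimal functional sketch, assuming a dictionary stands in for the cache; the function name `predicated_load` and its parameters are hypothetical, and it does not model the micro-operations the hardware would actually insert.

```python
def predicated_load(load_addr, predicted_store_addr, predicted_store_value, cache):
    """Return the load's value, chosen by comparing the two addresses."""
    # Predicate: does the load really collide with the predicted store?
    predicate = load_addr == predicted_store_addr
    # True: forward the in-flight store's value. False: read the cache instead.
    return predicted_store_value if predicate else cache[load_addr]


cache = {0x10: 1, 0x20: 2}               # stand-in for committed memory state
print(predicated_load(0x10, 0x10, 99, cache))  # addresses match, store value forwarded
print(predicated_load(0x20, 0x10, 99, cache))  # no collision, cache value used
```

Because both outcomes are handled by the selection, the load in this model never has to wait for the dependence to be confirmed, which corresponds to the abstract's point that predicated loads can execute earlier without triggering a misprediction recovery.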

    Mower: A new design for non-blocking misprediction recovery

    Mower is a micro-architecture technique that targets the branch misprediction penalty in superscalar processors. It speeds up misprediction recovery by dynamically evicting stale instructions and correcting the Register Alias Table (RAT) using explicit control dependency tracking, which is implemented with simple bit matrices. This low-overhead technique allows the recovery process to overlap with instruction fetching, renaming and scheduling from the correct path. Our evaluation indicates that the mechanism yields performance very close to ideal recovery and provides up to 5% speed-up and a 2% reduction in power consumption compared to a recovery mechanism using a reorder buffer and a walker. The simplicity of the mechanism should permit easy implementation of Mower in an actual processor.

    Identification of Novel Hub Genes Associated with Psoriasis Using Integrated Bioinformatics Analysis

    Psoriasis is a chronic, prolonged, and recurrent inflammatory skin disease, and current therapeutics can only alleviate the symptoms rather than cure it completely. Therefore, we aimed to identify the molecular signatures and specific biomarkers of psoriasis to provide novel clues for psoriasis and targeted therapy. In the present study, the Gene Expression Omnibus (GEO) database was used to retrieve three microarray datasets (GSE166388, GSE50790 and GSE42632), and the differentially expressed genes (DEGs) in psoriasis were explored using the Affy package in R software. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to determine the common DEGs and their functions. The STRING database was used to construct a protein–protein interaction (PPI) network of DEG-encoded proteins, and the cytoHubba plugin was used to identify hub genes. Using the NetworkAnalyst platform, we detected transcription factors (TFs), microRNAs and drug candidates interacting with hub genes. In addition, the expression levels of hub genes in HaCaT cells were detected by western blot. We screened the up- and downregulated DEGs from the transcriptome microarrays of corresponding psoriasis patients. Functional enrichment of DEGs in psoriasis was mainly associated with positive regulation of leukocyte cell–cell adhesion and T cell activation, cytokine binding, cytokine activity and the Wnt signaling pathway. Through further data processing, we obtained 57 intersecting genes in the three datasets and probed them in STRING to determine the interactions of their expressed proteins. Using the cytoHubba plugin, we obtained 10 critical hub genes: TOP2A, CDKN3, MCM10, PBK, HMMR, CEP55, ASPM, KIAA0101, ESCO2, and IL-1β. Using these hub genes as targets, we obtained 35 TFs and 213 miRNAs that may regulate these genes and 33 potential therapeutic agents for psoriasis. Furthermore, the expression levels of TOP2A, MCM10, PBK, ASPM, KIAA0101 and IL-1β were observably increased in HaCaT cells. In conclusion, we identified potential biomarkers, risk factors and drugs for psoriasis.

    Decoupling address generation from loads and stores to improve data access energy efficiency

    Level-one data cache (L1 DC) accesses impact energy usage because they occur frequently and use significantly more energy than register file accesses. A memory access instruction consists of an address generation operation, which calculates the location where the data item resides in memory, and a data access operation, which loads or stores a value from or to that location. We propose to decouple these two operations into separate machine instructions to reduce energy usage. By associating the data translation lookaside buffer (DTLB) access and the L1 DC tag check with an address generation instruction, only a single data array in a set-associative L1 DC needs to be accessed during a load instruction when the result of the tag check is known at that point. In addition, many DTLB accesses and L1 DC tag checks are avoided by memoizing the DTLB way and L1 DC way with the register that holds the memory address to be dereferenced. Finally, our technique often allows an ALU operation to be coalesced with a load or store data access, reducing the number of instructions executed.